
    The TimeMachine for Inference on Stochastic Trees

    The simulation of genealogical trees backwards in time, from observations up to the most recent common ancestor (MRCA), is hindered by the fact that, while approaching the root of the tree, coalescent events become rarer, with a corresponding increase in computation time. The recently proposed "Time Machine" tackles this issue by stopping the simulation of the tree before reaching the MRCA and correcting for the induced bias. We present a computationally efficient implementation of this approach that exploits multithreading.
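
As a rough illustration of why the last steps towards the MRCA are the slow ones, the following minimal sketch assumes the standard Kingman coalescent (an assumption; the abstract does not name the tree model), in which the waiting time with k remaining lineages is exponential with rate k(k-1)/2. The function names and the `stop_at` parameter are hypothetical, and the sketch only truncates the simulation; it does not implement the bias correction of the Time Machine or the authors' multithreaded code.

```python
import random


def expected_waiting_times(n):
    """Expected time spent with k lineages (k = n..2) under the Kingman coalescent."""
    return {k: 2.0 / (k * (k - 1)) for k in range(n, 1, -1)}


def simulate_tree_height(n, rng, stop_at=1):
    """Simulate backwards in time until only `stop_at` lineages remain.

    stop_at=1 runs all the way to the MRCA; a larger value stops early,
    which is the (uncorrected) truncation idea described in the abstract.
    """
    t = 0.0
    for k in range(n, stop_at, -1):
        rate = k * (k - 1) / 2.0  # coalescence rate with k lineages
        t += rng.expovariate(rate)
    return t


times = expected_waiting_times(10)
total = sum(times.values())          # expected time to the MRCA, 2 * (1 - 1/n)
share_last = times[2] / total        # fraction spent waiting for the final event

rng = random.Random(1)
full = [simulate_tree_height(10, rng) for _ in range(20000)]
truncated = [simulate_tree_height(10, rng, stop_at=3) for _ in range(20000)]
mean_full = sum(full) / len(full)
mean_truncated = sum(truncated) / len(truncated)
```

Under these assumptions, with 10 sampled lineages the final coalescence alone accounts for more than half of the expected tree height, so stopping at a few remaining lineages removes most of the simulated time.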

    Automated calibration of consensus weighted distance-based clustering approaches using sharp

    In consensus clustering, a clustering algorithm is used in combination with a subsampling procedure to detect stable clusters. Previous studies on both simulated and real data suggest that consensus clustering outperforms native algorithms. Here we extend consensus clustering to allow for attribute weighting in the calculation of pairwise distances using existing regularised approaches. We propose a procedure for the calibration of the number of clusters (and regularisation parameter) by maximising a novel consensus score calculated directly from consensus clustering outputs, making it extremely computationally competitive. Our simulation study shows better clustering performance of (i) models calibrated by maximising our consensus score compared to existing calibration scores, and (ii) weighted compared to unweighted approaches in the presence of features that do not contribute to cluster definition. Application to real gene expression data measured in lung tissue reveals clear clusters corresponding to different lung cancer subtypes. The R package sharp (version 1.4.0) is available on CRAN.
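
The subsampling idea in the first sentence of this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the sharp package (which is written in R) and not the consensus score proposed by the authors: the clustering algorithm is an exact 2-means on one-dimensional data chosen only to keep the example self-contained, and all function names, parameters and data are hypothetical.

```python
import random
from itertools import combinations


def two_means_1d(values):
    """Exact 2-means for 1D data: try every cut of the sorted values."""
    order = sorted(range(len(values)), key=lambda i: values[i])

    def sse(idx):
        # Within-cluster sum of squared deviations from the cluster mean.
        if not idx:
            return 0.0
        mean = sum(values[i] for i in idx) / len(idx)
        return sum((values[i] - mean) ** 2 for i in idx)

    best_cut = min(range(1, len(order)),
                   key=lambda c: sse(order[:c]) + sse(order[c:]))
    labels = [0] * len(values)
    for i in order[best_cut:]:
        labels[i] = 1
    return labels


def consensus_matrix(data, n_iter=100, frac=0.67, seed=0):
    """Proportion of subsamples in which two items are clustered together,
    among the subsamples that contain both items."""
    rng = random.Random(seed)
    n = len(data)
    size = max(2, round(frac * n))
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_iter):
        subset = rng.sample(range(n), size)
        labels = two_means_1d([data[i] for i in subset])
        for (a, i), (b, j) in combinations(enumerate(subset), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[a] == labels[b]:
                together[i][j] += 1
                together[j][i] += 1
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else float("nan")
             for j in range(n)] for i in range(n)]


# Invented data: two well-separated groups of three points each.
data = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
consensus = consensus_matrix(data)
```

Entries close to 1 or 0 indicate pairs that are consistently grouped together or apart across subsamples; stable clusters show up as blocks of high co-membership.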

    Association between purchase of over-the-counter medications and ovarian cancer diagnosis in the Cancer Loyalty Card Study (CLOCS): observational case-control study

    BACKGROUND: Over-the-counter (OTC) medications are frequently used to self-care for nonspecific ovarian cancer symptoms prior to diagnosis. Monitoring such purchases may provide an opportunity for earlier diagnosis. OBJECTIVE: The aim of the Cancer Loyalty Card Study (CLOCS) was to investigate purchases of OTC pain and indigestion medications prior to ovarian cancer diagnosis in women with and without ovarian cancer in the United Kingdom using loyalty card data. METHODS: An observational case-control study was performed comparing purchases of OTC pain and indigestion medications prior to diagnosis in women with (n=153) and without (n=120) ovarian cancer using loyalty card data from two UK-based high street retailers. Monthly purchases of pain and indigestion medications for cases and controls were compared using the Fisher exact test, conditional logistic regression, and receiver operating characteristic (ROC) curve analysis. RESULTS: Pain and indigestion medication purchases were increased among cases 8 months before diagnosis, with maximum discrimination between cases and controls 8 months before diagnosis (Fisher exact odds ratio [OR] 2.9, 95% CI 2.1-4.1). An increase in indigestion medication purchases was detected up to 9 months before diagnosis (adjusted conditional logistic regression OR 1.38, 95% CI 1.04-1.83). The ROC analysis for indigestion medication purchases showed a maximum area under the curve (AUC) at 13 months before diagnosis (AUC=0.65, 95% CI 0.57-0.73), which further improved when stratified to late-stage ovarian cancer (AUC=0.68, 95% CI 0.59-0.78). CONCLUSIONS: There is a difference in purchases of pain and indigestion medications among women with and without ovarian cancer up to 8 months before diagnosis. Facilitating earlier presentation among those who self-care for symptoms using this novel data source could improve ovarian cancer patients' options for treatment and improve survival. 
TRIAL REGISTRATION: ClinicalTrials.gov NCT03994653; https://clinicaltrials.gov/ct2/show/NCT03994653
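
For readers unfamiliar with the AUC values reported above, the following minimal sketch computes the area under the ROC curve as the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. The function name and the purchase counts are invented for illustration; this is not the study's analysis code or data.

```python
def roc_auc(case_scores, control_scores):
    """AUC as P(case > control) + 0.5 * P(case == control) over all pairs."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical monthly purchase counts, invented for illustration only.
cases = [3, 4, 5]
controls = [1, 2, 3]
auc = roc_auc(cases, controls)
```

An AUC of 0.5 corresponds to no discrimination between cases and controls and 1.0 to perfect separation, which puts the reported values of 0.65 and 0.68 in context.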

    Cardiometabolic risk estimation using exposome data and machine learning

    Background: The human exposome encompasses all exposures that individuals encounter throughout their lifetime. It is now widely acknowledged that health outcomes are influenced not only by genetic factors but also by the interactions between these factors and various exposures. Consequently, the exposome has emerged as a significant contributor to the overall risk of developing major diseases, such as cardiovascular disease (CVD) and diabetes. Therefore, personalized early risk assessment based on exposome attributes might be a promising tool for identifying high-risk individuals and improving disease prevention. Objective: Develop and evaluate a novel and fair machine learning (ML) model for CVD and type 2 diabetes (T2D) risk prediction based on a set of readily available exposome factors. We evaluated our model using internal and external validation groups from a multi-center cohort. To be considered fair, the model was required to demonstrate consistent performance across different sub-groups of the cohort. Methods: From the UK Biobank, we identified 5,348 and 1,534 participants who within 13 years from the baseline visit were diagnosed with CVD and T2D, respectively. An equal number of participants who did not develop these pathologies were randomly selected as the control group. 109 readily available exposure variables from six different categories (physical measures, environmental, lifestyle, mental health events, sociodemographics, and early-life factors) from the participant's baseline visit were considered. We adopted the XGBoost ensemble model to predict individuals at risk of developing the diseases. The model's performance was compared to that of an integrative ML model which is based on a set of biological, clinical, physical, and sociodemographic variables, and, additionally for CVD, to the Framingham risk score. Moreover, we assessed the proposed model for potential bias related to sex, ethnicity, and age. 
Lastly, we interpreted the model's results using SHAP, a state-of-the-art explainability method. Results: The proposed ML model presents a comparable performance to the integrative ML model despite using solely exposome information, achieving a ROC-AUC of 0.78±0.01 and 0.77±0.01 for CVD and T2D, respectively. Additionally, for CVD risk prediction, the exposome-based model presents an improved performance over the traditional Framingham risk score. No bias in terms of key sensitive variables was identified. Conclusions: We identified exposome factors that play an important role in identifying patients at risk of CVD and T2D, such as naps during the day, age completed full-time education, past tobacco smoking, frequency of tiredness/unenthusiasm, and current work status. Overall, this work demonstrates the potential of exposome-based machine learning as a fair CVD and T2D risk assessment tool.

    The use of human papillomavirus DNA methylation in cervical intraepithelial neoplasia: A systematic review and meta-analysis

    Background: Methylation of viral DNA has been proposed as a novel biomarker for triage of human papillomavirus (HPV) positive women at screening. This systematic review and meta-analysis aims to assess how methylation levels change with disease severity and to determine diagnostic test accuracy (DTA) in detecting high-grade cervical intra-epithelial neoplasia (CIN). Methods: We performed searches in MEDLINE, EMBASE and CENTRAL from inception to October 2019. Studies were eligible if they explored HPV methylation levels in HPV positive women. Data were extracted in duplicate and requested from authors where necessary. Random-effects models and a bivariate mixed-effects binary regression model were applied to determine pooled effect estimates. Findings: 44 studies with 8819 high-risk HPV positive women were eligible. The pooled estimate of the positive methylation rate in the HPV16 L1 gene was higher for high-grade CIN (>= CIN2/high-grade squamous intra-epithelial lesion (HSIL); 72.7%, 95% CI 47.8-92.2) vs. low-grade CIN [...]. Interpretation: Higher HPV methylation is associated with increased disease severity, whilst HPV16 L1/L2 genes demonstrated high diagnostic accuracy to detect high-grade CIN in HPV16 positive women. Direct clinical use is limited by the need for multi-genotype and standardised assays. Next-generation multiplex HPV sequencing assays are under development and allow potential for rapid, automated and low-cost methylation testing. (C) 2019 The Authors. Published by Elsevier B.V.

    The Cord Blood Insulin and Mitochondrial DNA Content Related Methylome

    Mitochondrial dysfunction seems to play a key role in the etiology of insulin resistance. At birth, a link has already been established between mitochondrial DNA (mtDNA) content and insulin levels in cord blood. In this study, we explore shared epigenetic mechanisms of the association between mtDNA content and insulin levels, supporting the developmental origins of this link. First, the association between cord blood insulin and mtDNA content in 882 newborns of the ENVIRONAGE birth cohort was assessed. Cord blood mtDNA content was established via qPCR, while cord blood levels of insulin were determined using electrochemiluminescence immunoassays. Then the cord blood DNA methylome and transcriptome were determined in 179 newborns, using the human 450K methylation Illumina and Agilent Whole Human Genome 8 × 60 K microarrays, respectively. Subsequently, we performed an epigenome-wide association study (EWAS) adjusted for different maternal and neonatal variables. Afterward, we focused on the 20 strongest associations based on p-values to assign transcriptomic correlates and allocate corresponding pathways employing the R packages ReactomePA and RDAVIDWebService. On the regional level, we examined differential methylation using the DMRcate and Bumphunter packages in R. Cord blood mtDNA content and insulin were significantly correlated (r = 0.074, p = 0.028), still showing a trend after additional adjustment for maternal and neonatal variables (p = 0.062). We found an overlap of 33 pathways which were in common between the association with cord blood mtDNA content and insulin levels, including pathways of neurodevelopment, histone modification, cytochromes P450 (CYP)-metabolism, and biological aging. We further identified a DMR annotated to Repulsive Guidance Molecule BMP Co-Receptor A (RGMA) linked to cord blood insulin as well as mtDNA content. 
Metabolic variation in early life, represented by neonatal insulin levels and mtDNA content, might reflect or accommodate alterations in neurodevelopment, histone modification, CYP-metabolism, and aging, indicating etiological origins in epigenetic programming. Variation in metabolic hormones at birth, reflected by molecular changes, might via these alterations predispose children to metabolic diseases later in life. The results of this study may provide important markers for subsequent targeted studies.

    Creutzfeldt-Jakob disease: update

    Although rare, human diseases caused by non-conventional transmissible agents (NCTA, or prions) are under constant scrutiny and are sometimes associated with irrational fears. This article briefly reviews the clinical, biological, neuroimaging, genetic and neuropathological data on the different variants of Creutzfeldt-Jakob disease. Recent leads on their pathogenesis, the resulting public health challenges, the main preventive measures in place, the operation of the French surveillance networks, and recent diagnostic and therapeutic hopes are summarised.

    Sensors of integral absorbed dose of ionizing radiation based on MOS transistors

    The requirements for the design and fabrication technology of p-channel and n-channel MOS transistors with a thick oxide layer, intended for use as integral dosimeters of absorbed ionizing-radiation dose, are defined. A technology for fabricating radiation-sensitive MOS transistors with a thick oxide layer, in both p-channel and n-channel versions, has been developed.

    Refinement of a Methodology for Untargeted Detection of Serum Albumin Adducts in Human Populations

    Covalently modified blood proteins (e.g., serum albumin adducts) are increasingly being viewed as potential biomarkers via which the environmental causes of human diseases may be understood. The notion that some (perhaps many) modifications have yet to be discovered has led to the development of untargeted adductomics methods, which attempt to capture entire populations of adducts. One such method is fixed-step selected reaction monitoring (FS-SRM), which analyses distributions of serum albumin adducts via shifts in the mass of a tryptic peptide [Li et al. (2011) Mol. Cell. Proteomics 10, M110.004606]. Working on the basis that FS-SRM might be able to detect biological variation due to environmental factors, we aimed to scale the methodology for use in an epidemiological setting. Development of sample preparation methods led to a batch workflow with increased throughput and provision for quality control. Challenges posed by technical and biological variation were addressed in the processing and interpretation of the data. A pilot study of 20 smokers and 20 never-smokers provided evidence of an effect of smoking on levels of putative serum albumin adducts. Differences between smokers and never-smokers were most apparent in putative adducts with net gains in mass between 105 and 114 Da (relative to unmodified albumin). The findings suggest that our implementation of FS-SRM could be useful for studying other environmental factors with relevance to human health.